Feature Search Patterns Guide
This guide explains the pattern matching system used for feature searching in the Excel import process. Feature search patterns allow you to identify and extract specific data based on cell formatting, content, and location.
Pattern Syntax
Feature patterns use a special syntax that combines multiple criteria using asterisks (*) as separators:
FEATURE_TYPE*FORMAT_CODE*POSITION*COLUMN
Components
- FEATURE_TYPE: The type of feature to search for (e.g., FONT_COLOR, BORDER_RIGHT)
- FORMAT_CODE: The specific format to match (e.g., FF000000 for black color)
- POSITION: The relative position to search (e.g., FIRST, LAST, or numeric index)
- COLUMN: The column index to search in (optional)
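As an illustration of the syntax above, the following minimal Python sketch splits a pattern string into its four components. The names `FeaturePattern` and `parse_pattern` are hypothetical and not part of the import tool; the sketch also assumes that trailing components may be omitted (as in `BORDER_RIGHT*THIN`) and that COLUMN is an integer.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FeaturePattern:
    """Hypothetical container for the four pattern components."""
    feature_type: str
    format_code: Optional[str] = None
    position: Optional[str] = None  # "FIRST", "LAST", or a numeric string
    column: Optional[int] = None


def parse_pattern(pattern: str) -> FeaturePattern:
    """Split a pattern on '*' into up to four components.

    Assumption: components missing from the end of the pattern are
    treated as unspecified (None).
    """
    parts = pattern.split("*")
    if not 1 <= len(parts) <= 4 or not parts[0]:
        raise ValueError(f"malformed pattern: {pattern!r}")
    # Pad with None so the unpacking below always sees four values.
    padded: List[Optional[str]] = parts + [None] * (4 - len(parts))
    feature_type, format_code, position, column = padded
    return FeaturePattern(
        feature_type=feature_type,
        format_code=format_code,
        position=position,
        column=int(column) if column is not None else None,
    )
```

For example, `parse_pattern("FONT_COLOR*FF000000*FIRST*15")` yields a `FeaturePattern` with `column=15`, while `parse_pattern("BORDER_RIGHT*THIN")` leaves `position` and `column` as `None`.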
Feature Types
Font-Related Features
- FONT_COLOR: Matches text color
- FONT_BOLD: Matches bold text
- FONT_ITALIC: Matches italic text
- FONT_SIZE: Matches specific font sizes
Border Features
- BORDER_RIGHT: Matches right border properties
- BORDER_LEFT: Matches left border properties
- BORDER_TOP: Matches top border properties
- BORDER_BOTTOM: Matches bottom border properties
Cell Features
- BACKGROUND_COLOR: Matches cell background color
- CELL_TYPE: Matches cell data type
- CELL_FORMAT: Matches cell format codes
Pattern Examples
Basic Patterns
{
  "FONT_COLOR*FF000000*FIRST*15": "Black text in first row, column 15",
  "BORDER_RIGHT*THIN": "Thin right border anywhere",
  "BACKGROUND_COLOR*FFFFFFFF*LAST": "White background in last row"
}
Complex Patterns
Multiple patterns can be combined using search types:
{
  "TYPE": "OR_ARRAY",
  "FEATURES": [
    "FONT_COLOR*FF000000*FIRST*15",
    "FONT_COLOR*FF000000*FIRST*17",
    "FONT_COLOR*FF000000*FIRST*19"
  ]
}
Search Types
Single Pattern Search
Used for simple feature matching:
{
  "TYPE": "CONDITION",
  "FEATURE": "TopBorder.Thin"
}
OR Search
Matches if any pattern matches:
{
  "TYPE": "OR",
  "FEATURES": [
    "FONT_COLOR*FF000000*FIRST*14",
    "FONT_COLOR*FF000000*2*14"
  ]
}
AND Search
Requires all patterns to match:
{
  "TYPE": "AND",
  "FEATURES": [
    "BORDER_RIGHT*THIN",
    "FONT_BOLD*TRUE"
  ]
}
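The three search types above differ only in how the individual pattern results are combined. A minimal Python sketch of that combination logic follows. It is not the importer's implementation: `evaluate_search` is a hypothetical name, and the `matches` callback stands in for whatever actually tests one pattern against a sheet. The `OR_ARRAY` type is left out because its semantics are not described in this guide.

```python
from typing import Any, Callable, Dict


def evaluate_search(search: Dict[str, Any], matches: Callable[[str], bool]) -> bool:
    """Combine per-pattern results according to the search TYPE.

    `matches` is a caller-supplied function (an assumption of this sketch)
    that reports whether a single pattern string matches.
    """
    search_type = search["TYPE"]
    if search_type == "CONDITION":
        # Single pattern search: one FEATURE entry.
        return matches(search["FEATURE"])

    features = search["FEATURES"]
    if isinstance(features, str):
        # Tolerate a single pattern given as a plain string.
        features = [features]

    if search_type == "OR":
        return any(matches(f) for f in features)
    if search_type == "AND":
        return all(matches(f) for f in features)
    raise ValueError(f"unsupported search type: {search_type!r}")
```

Passing the matcher in as a callback keeps the combination logic independent of how cells are read, so it can be exercised with a plain set of known-matching patterns.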
Feature Label Search
Special configuration for searching labeled features:
"FeatureLabelSearch": {
  "FRANJAS": {
    "SOURCE_FEATURE": "PAGE_SPLIT_TYPE",
    "TYPE": "OR",
    "ROW_THEN_COLUMN": "true",
    "FEATURES": "FONT_COLOR*FFFF0000*FIRST",
    "MAX_COLUMN": 10
  }
}
Format Codes
Color Codes
- FF000000: Black
- FFFF0000: Red
- FF0000FF: Blue
- FF00FF00: Green
- FFFFFF00: Yellow
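The codes listed above are consistent with eight-digit hexadecimal ARGB values (alpha, red, green, blue, two digits each), with the leading FF meaning fully opaque. Assuming that reading, a small Python sketch can decode a code into its channels, which helps when checking a code by hand. `decode_argb` is a hypothetical helper, not part of the import tool.

```python
import re
from typing import Tuple


def decode_argb(code: str) -> Tuple[int, int, int, int]:
    """Decode an 8-digit hex color code into (alpha, red, green, blue).

    Assumes the ARGB layout suggested by the codes in this guide.
    """
    if not re.fullmatch(r"[0-9A-Fa-f]{8}", code):
        raise ValueError(f"expected 8 hex digits, got {code!r}")
    value = int(code, 16)
    return (
        (value >> 24) & 0xFF,  # alpha
        (value >> 16) & 0xFF,  # red
        (value >> 8) & 0xFF,   # green
        value & 0xFF,          # blue
    )
```

Under this reading, `decode_argb("FFFF0000")` gives `(255, 255, 0, 0)`, which is opaque red, and a six-digit code such as `FFFFFF` is rejected because the alpha channel is missing.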
Border Styles
- THIN: Thin border
- MEDIUM: Medium border
- THICK: Thick border
- DOUBLE: Double line border
Search Locations
Position Specifiers
- FIRST: First occurrence
- LAST: Last occurrence
- Numeric values (1, 2, 3, etc.): Specific occurrence
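To make the position specifiers concrete, here is a minimal Python sketch that picks an occurrence from an ordered list of matches. It rests on two assumptions that this guide does not state explicitly: numeric positions are 1-based, and a missing position keeps every match. `select_occurrence` is a hypothetical name.

```python
from typing import List, Optional, TypeVar

T = TypeVar("T")


def select_occurrence(matches: List[T], position: Optional[str]) -> List[T]:
    """Return the matches selected by a position specifier.

    Assumptions: numeric positions count from 1, and position=None
    means "no constraint". An out-of-range position selects nothing.
    """
    if position is None:
        return list(matches)
    if position == "FIRST":
        return matches[:1]
    if position == "LAST":
        return matches[-1:]
    index = int(position)  # raises ValueError for non-numeric specifiers
    if index < 1:
        raise ValueError(f"position must be >= 1, got {index}")
    return matches[index - 1:index]
```

Returning a list in every case, rather than a single item or `None`, lets callers treat "no match" and "position out of range" the same way.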
Direction Control
- ROW_THEN_COLUMN: Search row-wise first
- COLUMN_THEN_ROW: Search column-wise first
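One plausible reading of the two directions is that they set the order in which cells are visited, and therefore which match counts as FIRST or LAST. The Python sketch below illustrates that reading only; it assumes ROW_THEN_COLUMN scans each row completely before moving to the next row, and `iter_cells` is a hypothetical helper using 0-based indices.

```python
from typing import Iterator, Tuple


def iter_cells(n_rows: int, n_cols: int, row_then_column: bool = True) -> Iterator[Tuple[int, int]]:
    """Yield (row, column) indices in the chosen scan order.

    Assumption: row_then_column=True walks row by row; False walks
    column by column.
    """
    if row_then_column:
        for row in range(n_rows):
            for col in range(n_cols):
                yield (row, col)
    else:
        for col in range(n_cols):
            for row in range(n_rows):
                yield (row, col)
```

On a 2x2 range, the row-wise order visits (0, 0), (0, 1), (1, 0), (1, 1), whereas the column-wise order visits (0, 0), (1, 0), (0, 1), (1, 1), so the second cell visited differs between the two.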
Best Practices
- Pattern Organization
  - Group related patterns together
  - Use meaningful names for pattern groups
  - Document complex pattern combinations
- Performance Optimization
  - Use specific column references when possible
  - Limit search ranges where appropriate
  - Combine related patterns using OR/AND operations
- Maintainability
  - Use ReusableFormats for common patterns
  - Document color codes and special formats
  - Keep pattern structure consistent
Common Pattern Use Cases
Header Detection
{
  "TYPE": "AND",
  "FEATURES": [
    "FONT_BOLD*TRUE*FIRST",
    "BACKGROUND_COLOR*FF000000*FIRST"
  ]
}
Data Region Borders
{
  "TYPE": "OR",
  "FEATURES": [
    "BORDER_RIGHT*THIN",
    "BORDER_LEFT*THIN"
  ]
}
Special Cell Formatting
{
  "TYPE": "AND",
  "FEATURES": [
    "FONT_COLOR*FF0000FF",
    "FONT_BOLD*TRUE"
  ]
}
Troubleshooting
Common Issues
- Pattern Not Matching
  - Verify color codes are correct
  - Check position specifiers
  - Confirm column numbers
- Multiple Matches
  - Use more specific patterns
  - Add position constraints
  - Consider using AND conditions
- Performance Issues
  - Limit search ranges
  - Use specific column references
  - Optimize pattern combinations
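The checks listed under "Pattern Not Matching" can be partly automated before a pattern ever reaches the importer. The Python sketch below is a hypothetical lint helper, not a feature of the tool. It assumes color codes are eight hex digits, positions are FIRST, LAST, or a positive-looking integer, and columns are integers.

```python
import re
from typing import List

# Feature types whose FORMAT_CODE is a color (taken from this guide).
COLOR_FEATURES = {"FONT_COLOR", "BACKGROUND_COLOR"}


def lint_pattern(pattern: str) -> List[str]:
    """Return a list of human-readable problems found in a pattern string."""
    problems: List[str] = []
    parts = pattern.split("*")
    if len(parts) > 4:
        problems.append("more than four '*'-separated components")

    feature_type = parts[0]
    format_code = parts[1] if len(parts) > 1 else None
    position = parts[2] if len(parts) > 2 else None
    column = parts[3] if len(parts) > 3 else None

    # Color codes: assumed to be exactly eight hexadecimal digits.
    if feature_type in COLOR_FEATURES and format_code is not None:
        if not re.fullmatch(r"[0-9A-Fa-f]{8}", format_code):
            problems.append(f"color code {format_code!r} is not 8 hex digits")

    # Position specifiers: FIRST, LAST, or a plain integer.
    if position is not None and position not in ("FIRST", "LAST"):
        if not re.fullmatch(r"[0-9]+", position):
            problems.append(f"unknown position specifier {position!r}")

    # Column numbers: a plain integer.
    if column is not None and not re.fullmatch(r"[0-9]+", column):
        problems.append(f"column {column!r} is not a number")

    return problems
```

An empty result means none of these particular checks failed; it does not guarantee that the pattern matches anything in a given workbook.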
Pattern Testing
Test patterns incrementally:
- Start with basic patterns
- Add complexity gradually
- Verify each component separately
- Combine patterns carefully